Abstract

The 3' end of the turkey coronavirus (TCV) genome (1740 bases), including the nucleocapsid (N) gene and 3'
untranslated region (UTR), was sequenced and compared with published sequences of other avian and mammalian
coronaviruses. The deduced sequence of the TCV N protein was determined to be 409 amino acids with a molecular
mass of approximately 45 kDa. The TCV N protein was identical in size and had greater than 90% amino acid
identity with published N protein sequences of infectious bronchitis virus (IBV); less than 21% identity was observed
with N proteins of bovine coronavirus and transmissible gastroenteritis virus. The 3' UTR showed some variation
among the three TCV strains examined: two TCV strains, Minnesota and Indiana, each contained a 153-base segment
that is not present in the NC95 strain. Nucleotide sequence identity between the 3' UTRs of TCV and IBV was
greater than 78%. Similarities in both size and sequence of TCV and IBV N proteins and 3' UTRs provide additional
evidence that these avian coronaviruses are closely related. © 1999 Elsevier Science B.V. All rights reserved.

1. Introduction

Coronaviruses are enveloped, pleomorphic
viruses, 80-220 nm in diameter, with club-shaped
surface projections and a positive-sense,
single-stranded RNA genome of 20-30 kilobases (kb;
Siddell, 1995; Holmes and Lai, 1996). The virion
contains at least three major structural proteins:
the surface (S) glycoprotein (90-180 kilodaltons
(kDa)), an integral membrane (M) protein (20-35
kDa), and a nucleocapsid (N) protein (50-60
kDa; Siddell, 1995; Holmes and Lai, 1996).
Additionally, some coronaviruses contain a fourth
major structural protein, the hemagglutinin-esterase
(HE) protein (120-140 kDa; Siddell, 1995;
Holmes and Lai, 1996).

Coronaviruses have been subdivided into three
major antigenic groups based on differences
identified by serological analyses, and these findings
have been substantiated by nucleotide sequence
analyses (Pedersen, 1978; Robb and Bond, 1979;
Wege et al., 1982; Williams et al., 1992). Human
coronavirus (HCV) 229E, transmissible gastroenteritis
virus (TGEV), canine coronavirus and feline
infectious peritonitis virus are members of
group I; HCV OC43, murine hepatitis virus
(MHV) and bovine coronavirus (BCV) are members
of group II; and infectious bronchitis virus
(IBV) belongs to group III. Previous studies by
Dea et al. (1990) and Verbeek and Tijssen (1991)
indicated that turkey coronavirus (TCV) is a
member of group II; however, recent antigenic
studies and nucleotide sequence analyses indicate
that TCV belongs to group III (Guy et al., 1997;
Breslin et al., 1999; Stephensen et al., 1999).

The N protein has been shown to be highly
variable in size and amino acid composition
between the viruses that comprise the three
coronavirus antigenic groups, but highly conserved
within these groups (Williams et al., 1992; Siddell,
1995). Lapps et al. (1987) examined the degree of
amino acid identity between the BCV N protein
and those of other coronaviruses; they found that
the N proteins of BCV and MHV, both group II
coronaviruses, had 70% identity, whereas the BCV
N protein had only 29% identity with those of
TGEV (group I) and IBV (group III). The N protein
also tends to be highly conserved among different
strains of the same coronavirus (Williams et al.,
1992; Laude and Masters, 1995). N proteins of 27
different IBV strains isolated in the USA, the UK,
Holland, Saudi Arabia, and Japan
were shown to have greater than 94% identity at
the amino acid level (Williams et al., 1992;
Zwaagstra et al., 1992).

The 3' untranslated region (UTR) of coronaviruses
is found downstream of the N gene and
is believed to be important in initiation of
negative-strand RNA synthesis. This region, like the N
gene, has been shown to be highly conserved
among different strains of the same coronavirus
(Collisson et al., 1990; Williams et al., 1993; Hsue
and Masters, 1997).


The avian coronaviruses, IBV and TCV, cause
host-specific diseases of economic importance.
Infectious bronchitis virus is the cause of
an acute, highly contagious respiratory disease in
chickens with potential involvement of the kidney and
reproductive tract (Cavanagh and Naqi, 1997).
TCV is the cause of an acute, highly contagious
enteric disease of turkeys referred to as bluecomb
disease (Nagaraja and Pomeroy, 1997). Early
studies indicated that IBV and TCV were
antigenically unrelated based on immune electron
microscopy, hemagglutination inhibition and
virus-neutralization studies (Ritchie et al., 1973;
Dea et al., 1986); however, these findings are
inconsistent with recent antigenic and nucleotide
sequence analyses (Guy et al., 1997; Breslin et al.,
1999; Stephensen et al., 1999). Antigenic analyses
by Guy et al. (1997) indicated that IBV and TCV
were closely related based on
cross-immunofluorescent studies using both polyclonal and
monoclonal antibodies. Subsequent studies
indicated that IBV and TCV were closely related
based on the nucleotide sequence of a 1.1-kb
segment of the TCV genome spanning portions of
both the M and N genes (Breslin et al., 1999).
Additionally, Stephensen et al. (1999)
demonstrated a close genetic relationship between TCV
and IBV based on sequence analysis of a highly
conserved region of the polymerase gene. In the
present study, the complete nucleocapsid gene and
the 3' UTR of three epidemiologically distinct
TCV strains were sequenced and compared with
those of previously published IBV strains and
representative members of groups I and II
coronaviruses, TGEV and BCV, respectively.


2. Materials and methods 

TCV strains examined in the present study
included the Minnesota strain (American Type Culture
Collection, ATCC VR-911), the Indiana strain (Tom
Hooper, Purdue University), and the NC95 strain
(Guy et al., 1997). These viruses were isolated
from turkeys in Minnesota in 1974, Indiana in
1994 and North Carolina in 1995, respectively.
TCV strains were propagated by amniotic
inoculation of embryonated turkey eggs (Guy et al.,
1997).

[Figure 1: amino acid sequence alignment with TCV (Minnesota) as the reference sequence; the alignment itself is not recoverable from the extracted text.]

Fig. 1. Comparison of amino acid sequences of the nucleocapsid protein of TCV (Minnesota, Indiana and NC95 strains) and
published sequences of IBV (Beaudette, KB8523 and M41 strains; Boursnell et al., 1985; Sutou et al., 1988). The positions where
amino acids are identical are indicated as (.).


Table 1

Percent sequence identity between N proteins of TCV (Minnesota, Indiana and NC95 strains) and published sequences of IBV
(Beaudette, KB8523 and M41 strains), BCV (Mebus) and TGEV (Purdue) (Boursnell et al., 1985; Kapke and Brian, 1986; Lapps et
al., 1987; Sutou et al., 1988)

                 TCV          TCV      TCV        IBV          IBV       IBV     BCV
                 (Minnesota)  (NC95)   (Indiana)  (Beaudette)  (KB8523)  (M41)   (Mebus)
TCV (NC95)       93.2%
TCV (Indiana)    92.7%        92.9%
IBV (Beaudette)  93.2%        93.2%    90.5%
IBV (KB8523)     95.6%        93.2%    92.9%      93.4%
IBV (M41)        93.2%        92.4%    91.2%      93.9%        93.4%
BCV (Mebus)      18.6%        20.2%    20.0%      20.0%        18.3%     19.3%
TGEV (Purdue)    20.2%        20.2%    19.4%      20.4%        20.4%     19.1%   20.2%


[Figure 2: nucleotide sequence alignment with TCV (Minnesota) as the reference sequence; the alignment itself is not recoverable from the extracted text.]

Fig. 2. Comparison of nucleotide sequences of the 3' untranslated region of TCV (Minnesota, Indiana, and NC95 strains) and
published sequences of IBV (Beaudette and M41 strains; Boursnell et al., 1985; Sutou et al., 1988). The positions where nucleotides
are missing are indicated as (-) and identical nucleotides are indicated as (.).


Viral RNA was extracted from sucrose
gradient-purified virus, and cDNA synthesis was
accomplished by reverse transcriptase polymerase
chain reaction (RT-PCR) as described (Breslin et
al., 1999). The RT reaction was primed with an
oligo (dT) primer 15 bases in length; this primer
along with a specific oligonucleotide primer
designed from the 5' end of the N gene
(TGAATTCTAAATTCACCTCAACCTAAGT)
(Breslin et al., 1999) was used in the PCR
procedure. Primers possessed EcoRI restriction sites
that facilitated cloning of cDNA into pUC19;
clones were used to transform competent
Escherichia coli strain DH5α (Gibco BRL) as
described (Sambrook et al., 1989). DNA was
sequenced at the University of North Carolina
(Chapel Hill) Automated DNA Sequencing Facility
on a Model 373A DNA Sequencer using the
Taq Dye Deoxy™ Terminator Cycle Sequencing
Kit (Applied Biosystems). All sequences were
confirmed by sequencing of both strands.
Preliminary nucleotide sequence data allowed the design
of primers to amplify internal N gene and 3' UTR
regions to confirm the nucleotide sequence.
Primers again were designed with EcoRI restriction
sites and used in an RT-PCR procedure, and cDNA
fragments were cloned and sequenced as described
above. The complete nucleotide sequences of the N
gene and the 3' UTR of each TCV strain have been
deposited in GenBank; accession numbers are
TCV (Indiana) AF111995, TCV (Minnesota)
AF111996, TCV (NC95) AF111997.
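
As an illustration of the primer design described above, the following minimal Python sketch checks that the N gene primer carries an EcoRI recognition site (GAATTC). The function names are hypothetical and are not part of the original methods; only the primer sequence is taken from the text, with any whitespace removed.

```python
# Minimal sketch: locate EcoRI recognition sites in a primer.
# Function names are illustrative assumptions, not from the original study.

ECORI_SITE = "GAATTC"  # EcoRI recognition sequence


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def find_sites(seq, site=ECORI_SITE):
    """Return 0-based start positions of `site` in `seq`, ignoring whitespace."""
    seq = "".join(seq.split()).upper()
    return [i for i in range(len(seq) - len(site) + 1)
            if seq[i:i + len(site)] == site]


# Primer designed from the 5' end of the N gene, as printed in the text.
primer = "TGAATTCTAAATTCACCTCAACCTAAGT"

positions = find_sites(primer)
is_palindromic = reverse_complement(ECORI_SITE) == ECORI_SITE
```

Because the EcoRI site is palindromic, a single forward scan is sufficient; the site appears on both strands at the same position.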


3. Results and discussion 

Nucleotide sequences were entered into the
Translate Tool of the ExPASy molecular biology
WWW server of the Swiss Institute of Bioinformatics
and used to determine possible open reading
frames (ORFs). One ORF was identified for each
TCV strain; it predicted a protein 409 amino
acids in length with a molecular mass of
approximately 45 kDa. Previous sequence analyses of the
N proteins of IBV strains (Beaudette, M41, and
KB8523; Boursnell et al., 1985; Sutou et al., 1988;
Jia et al., 1995) resulted in similar findings: IBV N
proteins were determined to be 409 amino acids in
length with a molecular mass of approximately 45
kDa. In contrast, groups I and II coronaviruses
have been shown to possess N proteins of 378-389
amino acids and 448-455 amino acids, respectively
(Kapke and Brian, 1986; Lapps et al., 1987; Parker
and Masters, 1990; Williams et al., 1992).
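
The ORF search and mass estimate described above can be sketched in a few lines of Python. This is not the ExPASy Translate Tool: it is a simplified illustration that scans only the forward strand, uses the standard genetic code, and estimates mass with the common rule of thumb of about 110 Da per residue (an assumption, not a value stated in the text). The function names and the toy sequence are hypothetical.

```python
# Minimal sketch: find the longest forward-strand ORF and estimate protein mass.
# Assumptions: standard genetic code; ~110 Da average residue mass (rule of thumb).

BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def longest_orf(seq):
    """Return the longest protein encoded from an ATG to an in-frame stop codon."""
    seq = seq.upper().replace("U", "T")
    best = ""
    for start in range(len(seq) - 2):
        if seq[start:start + 3] != "ATG":
            continue
        protein = []
        for pos in range(start, len(seq) - 2, 3):
            aa = CODON_TABLE.get(seq[pos:pos + 3], "X")  # X = unknown codon
            if aa == "*":  # stop codon closes the ORF
                if len(protein) > len(best):
                    best = "".join(protein)
                break
            protein.append(aa)
    return best


def approx_mass_kda(n_residues, avg_residue_da=110.0):
    """Rough protein mass in kDa from residue count."""
    return n_residues * avg_residue_da / 1000.0


toy_sequence = "CCATGGCTAGCGGTAAATAAGG"  # hypothetical example, not TCV data
toy_protein = longest_orf(toy_sequence)
n_protein_mass = approx_mass_kda(409)
```

With this rule of thumb, a 409-residue protein comes to roughly 45 kDa, in line with the mass reported for the TCV and IBV N proteins.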

The deduced amino acid sequences of the N
protein of TCV are shown in Fig. 1. The TCV
(Minnesota) strain is used as the reference strain,
with amino acid differences noted for the Indiana
strain, the NC95 strain and three strains of IBV
(Beaudette, M41, KB8523; Boursnell et al., 1985;
Sutou et al., 1988). A comparison of the percent
identity of TCV N protein amino acid sequences
with published sequences of three IBV strains
(Beaudette, M41, KB8523), BCV (Mebus) and
TGEV (Purdue) is presented in Table 1 (Boursnell
et al., 1985; Kapke and Brian, 1986; Lapps et al.,
1987; Sutou et al., 1988). Overall, the N protein
sequences of TCV and IBV strains have greater
than 90% identity. TCV strains have 92.7-93.2%
sequence identity when compared to each other,
and 90.5-95.6% identity to IBV. In contrast, TCV
and IBV N proteins have less than 21% identity
with BCV (Mebus) and TGEV (Purdue). Previous
studies comparing the TCV M protein (70 amino
acids at carboxy-terminus) and N protein (55
amino acids at amino-terminus) and published
sequences from other coronaviruses indicated
greater than 90% identity of both protein segments
with IBV, and less than 30% identity with groups
I and II coronaviruses (Breslin et al., 1999).
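
The ranges quoted in this paragraph can be checked directly against the pairwise values in Table 1. The short Python sketch below transcribes the lower-triangular table and recomputes the minimum and maximum identity for each comparison; the variable and function names are illustrative only.

```python
# Minimal sketch: recompute the identity ranges quoted in the text from Table 1.
# Values are transcribed from Table 1; names are illustrative.

names = ["TCV Minnesota", "TCV NC95", "TCV Indiana",
         "IBV Beaudette", "IBV KB8523", "IBV M41",
         "BCV Mebus", "TGEV Purdue"]

# Row i holds identities between names[i + 1] and names[0..i].
rows = [
    [93.2],
    [92.7, 92.9],
    [93.2, 93.2, 90.5],
    [95.6, 93.2, 92.9, 93.4],
    [93.2, 92.4, 91.2, 93.9, 93.4],
    [18.6, 20.2, 20.0, 20.0, 18.3, 19.3],
    [20.2, 20.2, 19.4, 20.4, 20.4, 19.1, 20.2],
]

pairs = {}
for i, row in enumerate(rows, start=1):
    for j, value in enumerate(row):
        pairs[(names[i], names[j])] = value


def span(prefix_a, prefix_b):
    """Return (min, max) identity over all pairs between two virus groups."""
    values = [v for (r, c), v in pairs.items()
              if (r.startswith(prefix_a) and c.startswith(prefix_b))
              or (r.startswith(prefix_b) and c.startswith(prefix_a))]
    return min(values), max(values)


tcv_vs_tcv = span("TCV", "TCV")
tcv_vs_ibv = span("TCV", "IBV")
# Highest identity between any avian (TCV/IBV) and mammalian (BCV/TGEV) N protein.
avian_vs_mammalian_max = max(
    v for (r, c), v in pairs.items()
    if r.startswith(("BCV", "TGEV")) and c.startswith(("TCV", "IBV"))
)
```

The recomputed ranges agree with the text: 92.7-93.2% among TCV strains, 90.5-95.6% between TCV and IBV, and a maximum of 20.4% against BCV and TGEV.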

Differences among TCV strains were observed in 
the 3' UTRs (Fig. 2). TCV (Minnesota) and TCV 
(Indiana) contained a 153-nucleotide segment that 
was not present in the NC95 strain. The 3' UTRs of
TCV (Indiana) and TCV (Minnesota) were 502 bp 
in length, compared with a 349-bp 3' UTR of TCV 
(NC95). The missing nucleotide segment in the 
NC95 3' UTR occurs immediately downstream of 
the N gene (Fig. 2). Similar differences in structure 
of 3' UTRs have been observed among IBV strains 
(Williams et al., 1993; Sapats et al., 1996). IBV 
strains (Beaudette, KB8523, CU-T2) have been 
shown to possess 3' UTRs of 503-505 nucleotides 
(Boursnell et al., 1985; Sutou et al., 1988). IBV 
(M41) was found to differ from other strains in 
that it possessed a 3' UTR of 320 bases; compared 
with other IBV strains, M41 lacked a 183-196
nucleotide sequence occurring four bases
downstream of the N gene (Boursnell et al., 1985; Sapats
et al., 1996). The significance of these differences in 
the 3' UTRs has not been determined (Collisson et 
al., 1990; Sapats et al., 1996). 

Comparison of 3' UTRs of TCV strains
(Minnesota, Indiana, NC95) and published sequences of
three IBV strains (Beaudette, KB8523, M41)
demonstrated sequence identity greater than 78%
(data not shown). TCV strains had 90.8-96.0%
sequence identity when compared to each other,
and 78.5-94.4% identity to IBV. TCV 3' UTRs
had less than 30% identity with those of BCV
(Mebus) and TGEV (Purdue). In these comparisons,
the large deletions in the 5' portion of the 3'
UTR of TCV (NC95) and IBV (M41) were
counted as single differences, rather than considering
each missing base as a separate difference.
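
The effect of counting a large deletion as a single difference can be illustrated with a small Python sketch. The exact scoring used in the study is not described beyond the sentence above, so the denominator convention here (each contiguous gap run contributes one compared position) is an assumption, and the function name and toy sequences are hypothetical.

```python
# Minimal sketch: percent identity of two aligned sequences, optionally
# counting each contiguous gap run as one difference.
# Assumption: a collapsed gap run adds exactly one position to the denominator.


def percent_identity(a, b, collapse_gaps=True):
    """Percent identity of aligned sequences `a` and `b` ('-' marks a gap)."""
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    matches = 0
    positions = 0
    in_gap_run = False
    for x, y in zip(a, b):
        if x == "-" and y == "-":
            continue  # column gapped in both sequences carries no information
        if x == "-" or y == "-":
            if collapse_gaps:
                if not in_gap_run:
                    positions += 1  # whole run counts as a single difference
                in_gap_run = True
            else:
                positions += 1  # every missing base counts as a difference
            continue
        in_gap_run = False
        positions += 1
        if x == y:
            matches += 1
    return 100.0 * matches / positions


# Toy alignment (not TCV data): a 4-base deletion in the second sequence.
seq_full = "ACGTACGTAC"
seq_deleted = "AC----GTAC"

collapsed = percent_identity(seq_full, seq_deleted, collapse_gaps=True)
per_base = percent_identity(seq_full, seq_deleted, collapse_gaps=False)
```

In the toy case the identity is about 85.7% with the deletion collapsed and 60% when every missing base is counted, which shows why the convention needs to be stated when strains differ by a deletion of 150 or more bases.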

Williams et al. (1993) previously compared 3'
UTR sequences of several different IBV strains,
comparing them in two sections: the 5' region not
found in IBV (M41) (184-196 bases) and the
remaining bases downstream of this sequence.
Results of the study indicated that the 5' end of
the IBV 3' UTR is quite variable between strains,
ranging from 53.2 to 92.8% identity. In contrast,
the sequence downstream of this hypervariable
region was highly conserved with 94.3-97.8%
identity (Williams et al., 1993). In the present
study we performed similar comparisons with the
TCV 3' UTR sequence data (data not shown).
The 5' region of the 3' UTR (153 bases) of TCV
(Minnesota) and TCV (Indiana) had 94.7%
identity with each other and 57.8-90.4% identity with
IBV strains (Beaudette, KB8523). The remaining
bases of the 3' UTR of TCV strains (Minnesota,
Indiana, NC95) had 96.5-97.8% sequence identity
when compared to each other, and 91.7-95.2%
identity to IBV strains (Beaudette, KB8523,
M41).

In summary, the amino acid sequence of TCV
N protein and the nucleotide sequence of the 3'
UTR were compared with published sequences of
other avian and mammalian coronaviruses. The
size and sequence characteristics of the TCV N
protein and 3' UTR closely resembled those of
IBV strains, thus supporting previous antigenic
analyses and nucleotide sequence studies that
indicated a close relationship between TCV and
IBV (Breslin et al., 1999; Guy et al., 1997;
Stephensen et al., 1999). Together, these findings
refute previous studies that indicated a close
relationship between TCV and group II coronaviruses
(Dea et al., 1990; Verbeek and Tijssen, 1991).
These findings instead indicate that the avian
coronaviruses, IBV and TCV, share a close
phylogenetic relationship and together comprise group
III of the coronavirus major antigenic groups.